Construction of Semantic Collocation Bank Based on Semantic Dependency Parsing

Authors

  • Shijun Liu
  • Yanqiu Shao
  • Yu Ding
  • Lijuan Zheng
Abstract

Collocation has always been an important issue in language research, especially in research on Chinese. Chinese is an isolating language, which lacks morphological change. Establishing a relatively complete dictionary of Chinese collocations would be a great contribution to Chinese study and research. Collocation plays a significant supporting role in many fields of NLP, such as information retrieval, machine translation, and information extraction. Ding and Bai proposed a method of query expansion based on local co-occurrence [1]; Lin incorporated collocation relations into a language model for query expansion, which overcame the shortage of relations caused by the lack of context in traditional queries [2]. In basic NLP research areas such as syntax and semantics, collocation also plays an important role. By comparing patterns of adjective collocation between Chinese learners of English and native speakers, Zhang analyzed the typical characteristics of different learners' use of adjective collocations [3]; Xing emphasized the importance of collocation in second language learning [4].
Early research on automatic collocation extraction was carried out by Choueka, Klein and Neuwitz, who defined collocations as adjacent words and used co-occurrence frequency to extract them [6]; Church and Hanks improved the automatic extraction technique and put forward mutual information as a measure for evaluating collocations [7]. By proposing a formula for calculating the strength of a collocation, introducing a dispersion formula, and integrating automatic part-of-speech tagging, Smadja's Xtract system raised the accuracy of collocation extraction to 80% [8]; Lin extracted collocations based on shallow syntactic parsing [9]; Shouxun YANG applied decision trees to extract collocations by integrating frequency, likelihood ratio, pointwise mutual information, variance, and other statistical indicators [10]. In China, a number of outstanding dictionaries have been published,

PACLIC 29
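
The two statistical ideas mentioned in the survey above, co-occurrence frequency over adjacent words and mutual information as an association score, can be illustrated with a small sketch. This is not the implementation of any of the cited systems: the function name `adjacent_pmi`, the `min_count` threshold, and the toy sentence are hypothetical, and the score shown is plain pointwise mutual information estimated from raw counts.

```python
import math
from collections import Counter


def adjacent_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    Illustrative only: `adjacent_pmi` and `min_count` are hypothetical
    names, not taken from the cited work.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent pairs only
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # frequency filter: drop rare pairs
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        # PMI = log2( P(x, y) / (P(x) * P(y)) )
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


tokens = "strong tea and strong tea but weak coffee and weak coffee".split()
scores = adjacent_pmi(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

A positive score means the pair co-occurs more often than its individual word frequencies would predict under independence; the frequency filter is there because PMI is known to overrate pairs built from very rare words.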


Similar resources

Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees

Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...

Feature Engineering in Persian Dependency Parser

Dependency parser is one of the most important fundamental tools in the natural language processing, which extracts structure of sentences and determines the relations between words based on the dependency grammar. The dependency parser is proper for free order languages, such as Persian. In this paper, data-driven dependency parser has been developed with the help of phrase-structure parser fo...

The Impact of L2 Semantic Tasks (L2 Collocation versus L2 Definition) on Iranian Intermediate EFL Learners’ Vocabulary Achievement

This study investigated the relationship between teaching L2 semantic tasks (collocation vs. definition) in vocabulary achievement of Iranian intermediate EFL learners. To this end, 60 students at intermediate level studying in the Simin Institute were selected from a total number of 100 participants based on their performance on Oxford Placement Test. After ensuring the criterion of homogeneit...

Applying Collocation Segmentation to the ACL Anthology Reference Corpus

Collocation is a well-known linguistic phenomenon which has a long history of research and use. In this study I employ collocation segmentation to extract terms from the large and complex ACL Anthology Reference Corpus, and also briefly research and describe the history of the ACL. The results of the study show that until 1986, the most significant terms were related to formal/rule based method...

Domain Specific Automatic Question Generation from Text

The goal of my doctoral thesis is to automatically generate interrogative sentences from descriptive sentences of Turkish biology text. We employ syntactic and semantic approaches to parse descriptive sentences. Syntactic and semantic approaches utilize syntactic (constituent or dependency) parsing and semantic role labeling systems respectively. After parsing step, question statements whose an...


Publication date: 2015